Fine-grained bulge-chasing kernels for strongly scalable parallel QR algorithms

Authors

  • Lars Karlsson
  • Bo Kågström
  • Eddie Wadbro
Abstract

The bulge-chasing kernel in the small-bulge multi-shift QR algorithm for the non-symmetric dense eigenvalue problem becomes a sequential bottleneck when the QR algorithm is run in parallel on a multicore platform with shared memory. The duration of each kernel invocation is short, but the critical path of the QR algorithm contains a long sequence of calls to the bulge-chasing kernel. We study the problem of parallelizing the bulge-chasing kernel itself across a handful of processor cores in order to reduce the execution time of the critical path. We propose and evaluate a sequence of four algorithms with varying degrees of complexity and verify that a pipelined algorithm with a slowly shifting block column distribution of the Hessenberg matrix is superior. The load-balancing problem is non-trivial and computational experiments show that the load-balancing scheme has a large impact on the overall performance. We propose two heuristics for the load-balancing problem and also an effective optimization method based on local search. Numerical experiments show that speed-ups are obtained for problems as small as 40 × 40 on two different multicore architectures.
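The abstract mentions an optimization method based on local search for the load-balancing problem but does not describe it. As a rough illustration of the general idea only, the following sketch moves the boundaries of a contiguous block column distribution one column at a time and keeps a move whenever it lowers the largest per-core load. The per-column cost vector `costs`, the function names, and the max-load objective are all assumptions made for this example; this is not the authors' algorithm.

```python
from typing import List, Sequence, Tuple


def block_loads(costs: Sequence[int], bounds: Sequence[int]) -> List[int]:
    """Total cost of each contiguous column block delimited by `bounds`."""
    edges = [0, *bounds, len(costs)]
    return [sum(costs[a:b]) for a, b in zip(edges, edges[1:])]


def local_search_partition(costs: Sequence[int], cores: int) -> Tuple[List[int], int]:
    """Hypothetical local search over block boundaries (illustrative only).

    Starts from an even split and shifts one boundary by one column at a
    time, accepting a shift only if it reduces the maximum block load.
    Requires len(costs) >= cores so that every block is non-empty.
    """
    n = len(costs)
    if cores < 1 or n < cores:
        raise ValueError("need at least one column per core")
    # Even initial split; boundaries are strictly increasing because n >= cores.
    bounds = [(i * n) // cores for i in range(1, cores)]
    best = max(block_loads(costs, bounds))
    improved = True
    while improved:
        improved = False
        for i in range(len(bounds)):
            for step in (-1, 1):
                trial = list(bounds)
                trial[i] += step
                lo = trial[i - 1] if i > 0 else 0
                hi = trial[i + 1] if i + 1 < len(trial) else n
                if not lo < trial[i] < hi:
                    continue  # the move would create an empty block
                load = max(block_loads(costs, trial))
                if load < best:
                    bounds, best, improved = trial, load, True
    return bounds, best


if __name__ == "__main__":
    # Made-up column costs: the first column is five times as expensive.
    print(local_search_partition([5, 1, 1, 1, 1, 1, 1, 1], cores=2))
```

Like any local search, this sketch can stop at a local optimum, so the result depends on the starting distribution.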

Similar resources

QR Convergence Theory and Practice

The family of GR algorithms is discussed. This includes the standard and multishift QR and LR algorithms, the Hamiltonian QR algorithm, divide-and-conquer algorithms such as the matrix sign function method, and many others. Factors that affect convergence are the quality of the transforming matrices and ratios of eigenvalues. Implicit implementations as bulge-chasing algorithms are discussed, as...

LAPACK Working Note #216: A novel parallel QR algorithm for hybrid distributed memory HPC systems

A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing (HPC) systems is presented. For this purpose, we introduce the concept of multi-window bulge chain chasing and parallelize aggressive early deflation. The multi-window approach ensures that most computations when chasing chains of bulges are performed ...

A novel parallel QR algorithm for hybrid distributed memory HPC systems

A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing (HPC) systems is presented. For this purpose, we introduce the concept of multi-window bulge chain chasing and parallelize aggressive early deflation. The multi-window approach ensures that most computations when chasing chains of bulges are performed ...

Bulge Exchanges in GZ Algorithms for the Standard and Generalized Eigenvalue Problems

One of the most important methods for solving the generalized eigenvalue problem Av = λBv is the QZ algorithm, which is a member of the larger class of GZ algorithms. These are normally implemented implicitly as bulge chasing algorithms. We show how to effect rational implicit GZ iterations by chasing bulges in both directions and passing them through each other. Application of this procedure t...

Parallel Two-Stage Hessenberg Reduction using Tile Algorithms for Multicore Architectures

This paper describes a parallel Hessenberg reduction in the context of multicore architectures using tile algorithms. The Hessenberg reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenvalue problem. Although expensive, orthogonal transformations are accepted techniques and commonly used for this reduction because they guaran...

Journal:
  • Parallel Computing

Volume 40

Publication date: 2014